915 research outputs found

Synthèse des conférences du congrès de l'ABF Bibliothèques et mémoire

Author: Bodin Bruno

Bibliothèque numérique de l'enssib

Timing analysis of synchronous programs using WCRT Algebra: Scalability through abstraction

Author: Bodin Bruno
Mendler Michael
Roop Partha S
Wang JiaJie
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/10/2017

Edinburgh Research Explorer

Compositional Timing-Aware Semantics for Synchronous Programming

Author: Aguado Joaquin
Bodin Bruno
Mendler Michael
Roop Partha S
Wang JiaJie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2018

Edinburgh Research Explorer

Automatic matching of legacy code to heterogeneous APIs: An idiomatic approach

Author: Bodin Bruno
Dubach Christophe
Ginsbach Philip
O'Boyle Michael
Remmelg Toomas
Steuwer Michel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/03/2018

Heterogeneous accelerators often disappoint. They provide the prospect of great performance, but only deliver it when using vendor-specific optimized libraries or domain-specific languages. This requires considerable legacy code modifications, hindering the adoption of heterogeneous computing. This paper develops a novel approach to automatically detect opportunities for accelerator exploitation. We focus on calculations that are well supported by established APIs: sparse and dense linear algebra, stencil codes, and generalized reductions and histograms. We call them idioms and use a custom constraint-based Idiom Description Language (IDL) to discover them within user code. Detected idioms are then mapped to BLAS libraries, cuSPARSE and clSPARSE, and two DSLs: Halide and Lift. We implemented the approach in LLVM and evaluated it on the NAS and Parboil sequential C/C++ benchmarks, where we detect 60 idiom instances. In those cases where idioms are a significant part of the sequential execution time, we generate code that achieves 1.26× to over 20× speedup on integrated and external GPUs.

Edinburgh Research Explorer

Enlighten

A Novel WCET semantics of Synchronous Programs

Author: Bodin Bruno
Mendler Michael
Roop Partha S
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/08/2016

Edinburgh Research Explorer

High-level synthesis of functional patterns with Lift

Author: Bodin Bruno
Dubach Christophe
Kristien Martin
Steuwer Michel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Crossref

Edinburgh Research Explorer

SimBench: A Portable Benchmarking Methodology for Full-System Simulators

Author: Bodin Bruno
Franke Bjoern
Spink Tom
Wagstaff Harry
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/07/2017

We acknowledge funding by the EPSRC grant PAMELA EP/K008730/1. Full-system simulators are increasingly finding their way into the consumer space for the purposes of backwards compatibility and hardware emulation (e.g. for games consoles). For such compute-intensive applications simulation performance is paramount. In this paper we argue that existing benchmark suites such as SPEC CPU2006, originally designed for architecture and compiler performance evaluation, are not well suited for the identification of performance bottlenecks in full-system simulators. While their large, complex workloads provide an indication as to the performance of the simulator on ‘real-world’ workloads, this does not give any indication of why a particular simulator might run an application faster or slower than another. In this paper we present SimBench, an extensive suite of targeted micro-benchmarks designed to run bare-metal on a full-system simulator. SimBench exercises dynamic binary translation (DBT) performance, interrupt and exception handling, memory access performance, I/O and other performance-sensitive areas. SimBench is a cross-platform benchmarking framework and can be retargeted to new architectures with minimal effort. For several simulators, including QEMU, Gem5 and SimIt-ARM, and targeting ARM and Intel x86 architectures, we demonstrate that SimBench is capable of accurately pinpointing and explaining real-world performance anomalies, which are largely obfuscated by existing application-oriented benchmarks.

Crossref

Edinburgh Research Explorer

University of St. Andrews - Pure

St Andrews Research Repository

Optimal and fast throughput evaluation of CSDF

Author: Bodin Bruno
Dinechin Benoît Dupont de
Munier-Kordon Alix
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

The Synchronous Dataflow Graph (SDFG) and Cyclo-Static Dataflow Graph (CSDFG) are two well-known models, used practically by industry for many years, and for which there is a large number of analysis techniques. Yet, basic problems such as the throughput computation or the liveness evaluation are not well solved, and their complexity is still unknown. In this paper, we propose K-Iter, an iterative algorithm based on K-periodic scheduling to compute the throughput of a CSDFG. By using this technique, we are able to compute in less than a minute the throughput of industry applications for which no result was available before.

Edinburgh Research Explorer

Fast and Efficient Dataflow Graph Generation

Author: Bodin Bruno
Delosme Jean-Marc
Lesparre Youen
Munier-Kordon Alix
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

Dataflow modeling is a highly regarded method for the design of embedded systems. Measuring the performance of the associated analysis and compilation tools requires an efficient dataflow graph generator. This paper presents a new graph generator for Phased Computation Graphs (PCG), which augment Cyclo-Static Dataflow Graphs with both initial phases and thresholds. A sufficient condition of liveness is first extended to the PCG model. The determination of initial conditions minimizing the total amount of initial data in the channels and ensuring liveness can then be expressed using Integer Linear Programming. This contribution and other improvements of previous works are incorporated in Turbine, a new dataflow graph generator. Its effectiveness is demonstrated experimentally by comparing it to two existing generators, DFTools and SDF3.

HAL Evry

Crossref

Edinburgh Research Explorer

Automatic Parameter Tuning of Motion Planning Algorithms

Author: Bodin Bruno
Cano Reyes Jose
Nagarajan Vijayanand
O'Boyle Michael
Yang Yiming
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Motion planning algorithms attempt to find a good compromise between planning time and quality of solution. Due to their heuristic nature, they are typically configured with several parameters. In this paper we demonstrate that, in many scenarios, the widely used default parameter values are not ideal. However, finding the best parameters to optimise some metric(s) is not trivial because the size of the parameter space can be large. We evaluate and compare the efficiency of four different methods (i.e. random sampling, AUC-Bandit, random forest, and Bayesian optimisation) to tune the parameters of two motion planning algorithms, BKPIECE and RRT-Connect. We present a table-top-reaching scenario where the seven degrees-of-freedom KUKA LWR robotic arm has to move from an initial to a goal pose in the presence of several objects in the environment. We show that the best methods for BKPIECE (AUC-Bandit) and RRT-Connect (random forest) improve the performance by 4.5x and 1.26x on average respectively. Then, we generate a set of random scenarios of increasing complexity, and we observe that optimal parameters found in simple environments perform well in more complex scenarios. Finally, we find that the time required to evaluate parameter configurations can be reduced by more than 2/3 with low error. Overall, our results demonstrate that for a variety of motion planning problems it is possible to find solutions that significantly improve the performance over default configurations while requiring very reasonable computation times.

Edinburgh Research Explorer

Enlighten